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Low frequency gravitational wave detectors, sucli as tfie Laser Interferometer Space Antenna 
(LISA), will have to contend with large foregrounds produced by millions of compact galactic bi- 
naries in our galaxy. While these galactic signals are interesting in their own right, the unresolved 
component can obscure other sources. The science yield for the LISA mission can be improved if 
the brighter and more isolated foreground sources can be identified and regressed from the data. 
Since the signals overlap with one another we are faced with a "cocktail party" problem of picking 
out individual conversations in a crowded room. Here we present and implement an end-to-end 
solution to the galactic foreground problem that is able to resolve tens of thousands of sources from 
across the LISA band. Our algorithm employs a variant of the Markov Chain Monte Carlo (MCMC) 
method, which we call the Blocked Annealed Metropolis-Hastings (BAM) algorithm. Following a 
description of the algorithm and its implementation, we give several examples ranging from searches 
for a single source to searches for hundreds of overlapping sources. Our examples include data sets 
from the first round of Mock LISA Data Challenges. 



I. INTRODUCTION 

Galactic compact binary systems are expected to be 
the major source of gravitational waves detected by the 
LISA observatory Tens of millions of such bina- 

ries will be emitting gravitational waves in the LISA 
band [1, H, 0, H, @ . Though most will be have a signal-to- 
' noise ratio (SNR) too low to be detectable, tens of thou- 
sands are expected to be resolvable if optimal signal anal- 
■ ysis algorithms are available Q. The signals from unre- 
solved binary systems will constitute a gravitational wave 
confusion noise. Identification of the brighter galactic bi- 
naries will be an important aid to resolving other sources 
for LISA; such as supermassive black hole binaries (SMB- 
HBs) [8, 9, 1(J, [111, [l^ and extreme mass ratio inspirals 
of compact objects into supermassive black holes (EM- 
; RIs) [11 [H, [il, [il]. The SMBHB, EMRI and compact 
binary signals have small, but non- vanishing overlap with 
one another, so we will ultimately want a simultaneous 
fit to all sources and source types. 

Various techniques have been proposed to extract 
the parameters of sources from the LISA data stream. 
The methods include Markov Chain Monte Carlo Meth- 
ods [13, [3 19, genetic algorithms [12], iterative 
methods |23l . [24 1 , grid-based template searching , to- 
mographic reconstruction (26j and time-frequency meth- 
ods [27|. For most of these methods (tomography and 
time- frequency analysis being the exceptions) optimal fil- 
tering is accomplished through the construction of tem- 
plates describing the signals from all sources in the data 
stream. For LISA, the vast number of sources involved 
makes a direct approach (such as a grid-based template 
bank) computationally impractical. It is for this reason 
ergodic methods such as MCMC and genetic algorithms 
have been applied to the LISA data analysis problem. 

In this work we develop an extension of the MCMC 
method 28, 29, 30] that is able to search the entire LISA 
band and simultaneously solve for thousands of galac- 
tic binaries. Our approach is based on the observation 



that while some signals can have significant overlap, sig- 
nals that are well separated in frequency have little or 
no overlap. We exploit this quasi-locality by breaking 
the search up into sub-regions in frequency, taking care 
with edge effects. In a departure from our previous ap- 
proach [ITj ]. we do not try and update all the source pa- 
rameters simultaneously at each iteration. This greatly 
reduces the computational cost, while still providing a 
global solution as the full meta-template comprising all 
the search templates is used to evaluate the likelihood. 
The parameter updates are now done in small blocks of 
highly correlated parameters, and the solution is updated 
using Metropolis-Hastings sampling. Simulated anneal- 
ing is employed during the search phase to improve the 
mixing of the chains. We demonstrate this "Blocked An- 
nealed Metropolis-Hastings" (BAM) algorithm on simu- 
lated LISA data that contains the signals from monochro- 
matic white dwarf binary systems (WD- WD) immersed 
in Gaussian instrumental noise. 

The paper is organized as follows: In Section [H] we re- 
view the MCMC algorithm and the Metropolis-Hastings 
sampling kernel. Additionally, we give a description of 
the BAM algorithm and demonstrate how its compu- 
tational cost scales with the number of sources being 
searched for. Section [HT] discusses various aspects of us- 
ing the BAM algorithm, such as hierarchical searching, 
optimal choices for search ranges, stopping criteria, and 
rates of occurrence for false positives and false negatives. 
Example searches are performed in Section IIVI ranging 
from individual sources to hundreds of sources in a re- 
stricted range of the LISA band. Concluding remarks 
are made in Section IVl 



II. ALGORITHM OVERVIEW 

In this section we describe the BAM algorithm and the 
issues and difficulties that we worked through in the de- 
velopment process. We start with a brief review of the 
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basic MCMC approach and Metropolis-Hastings impor- 
tance sampling. This is followed by a review of how the 
extrinsic source parameters are analytically solved for us- 
ing the F-Statistic. We then describe how the speed of 
the BAM algorithm scales with the number of sources 
that are being searched for. 



A. The Markov Chain Monte Carlo Algorithm and 
Metropolis-Hastings Sampling 

The MCMC algorithm is becoming a familiar tool in 
gravitational wave data analysis. Initially introduced 
to the field by Christensen and Meyer [3l|, its applica- 
tion to ground-based interferometers has been explored 
in the context of parameter extraction of coalescing bi- 
naries 32] and spinning neutron stars [s^. With its abil- 
ity to explore large parameter spaces while simultane- 
ously performing model selection and noise estimation, 
the MCMC method is ideally suited to the LISA data 
analysis problem. The Reverse Jump MCMC algorithm 
has been applied to the LISA-likc problem of identifying 
a large, yet unknown number of sinusoids in simulated 
Gaussian noise [s^, [sE] • It was shown that the method 
could correctly identify the number of resolvable signals 
present in the data and recover the signal parameters 
and an estimate of the noise level. The MCMC approach 
was first applied to simulated LISA data in the context 
of galactic binaries [l7l|. and it has since been applied to 
SMBHBs [ll,[il|2lf7^ 

In using an MCMC approach one wants to generate 
a sample set, {x} that corresponds to draws made from 
the posterior distribution of the system, p{X\s). The al- 
gorithm to develop such a set is surprisingly simple. 

We begin at a point in the parameter space of the 
binary system(s), x, (which may or may not be chosen at 
random) and propose a jump to a new position, y, based 
on some proposal distribution, q{-\x). The Hastings ratio 
is calculated using. 



H = 
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where p{x) is the prior of the parameters at x, q{x\y) is 
the value of the proposal distribution for a jump from x 
to y, and p{s\x) is the likelihood at x. The likelihood 
function, if the noise is a normal process with zero mean, 
is given by [s^ 



p{s\x) = C exp 



1 



{{s - hixMs - h{x))) 



(2) 



where the normalization constant C is independent of 
the signal, s, and (a|6) denotes the noise weighted inner 
product 
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where a and b are the gravitational waveforms, and Snif) 
is the one-sided noise spectral density. The jump will be 
accepted with probability a = min(l,i/). If the jump 
is rejected (Metropolis rejection [28]) the chain remains 
at its current state, x. Repeated jumps will produce a 
Markov chain whose stationary distribution is equal to 
the posterior distribution in question, p(x\s). Andrieu et 
al |34l . 37] provide a more general and thorough review 
of MCMC methods. 

The convergence to the correct posterior will occur for 
any (non-trivial) proposal distribution ^38] . However, the 
more accurate the proposal distribution is at modeling 
the posterior distribution the quicker the chain will con- 
vergence. Since we do not know the form of the posterior 
in advance of running the chain, we instead opt for maxi- 
mum flexibility in our choice of proposal distribution. We 
accomplish this by using a mixture of proposal distribu- 
tions, including occasional "bold" proposals that attempt 
large changes in the parameter values along with many 
"timid" proposals that attempt small changes in the pa- 
rameter values of the chain (for a detailed description 
of some of these proposals see [l3|)- Also added to our 
list of proposals are a few "tailored" proposals. These 
are proposed jumps based on our knowledge of the sym- 
metries and degeneracies of the likelihood surface. For 
example, for LISA there is a secondary maxima in the 
likelihood surface of a binary system that corresponds to 
a reflection about the ecliptic equator combined with a 
shift of TT in the ecliptic longitude. Thus, we include a 
proposal that attempts such a reflection. 

Another tailored proposal that was highly effective 
uses the fact that local maxima occur in the likeli- 
hood surface at multiples of the modulation frequency 
(/m — 1 /year) . The presence of these maxima are easily 
understood. The detector response imparts sidebands on 
the monochromatic Barycenter signal to produce a comb 
whose teeth are spaced by the modulation frequency. A 
template with a Barycenter frequency offset from the sig- 
nal by a multiple of fm will produce a similar comb whose 
teeth align with those of the signal. The fit can be im- 
proved by adjusting the other template parameters to 
better match the shape of the source comb. Figure [T] 
shows the power spectrum of a source and a template 
offset by Ifm in frequency, while in Figure [2] the tem- 
plate has had its other parameters altered to better fit 
the source comb. This improves the log likelihood from 
16.7% to 66.4% of the true parameter value. It proved 
to be difficult to find an analytic description of how the 
other parameters should be modified to improve the fit, 
so we went with a simple proposal that shifted the fre- 
quency of the source by 1/m + e, and shifted the other 
parameters using a uniform draw in a small range around 
the current binary parameters. 

Since the search used an F-statistic to extremize over 
the amplitude, inclination angle, polarization angle and 
initial phase, we only needed to assign priors to the fre- 
quency and sky location. For the frequency we used a 
uniform prior across the search range, and for the sky lo- 
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cations we used a distance weighted galactic distribution. 
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FIG. 1: Spectrum of a source and template with the same pa- 
rameters save their frequencies, which differ by one modula- 
tion frequency. The source has / — S.OmHz, A — 1.4 x 10"^^, 
6 = 1.0, = 3.14159, ip = 0.5, t = 0.785398, and ipo = 0.0. 
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FIG. 2: Spectrum of a sources and a template with frequencies 
that differ by one modulation frequency. The parameters of 
the template have been adjusted to maximize the overlap. 
The source has the same parameters as in Figure 1, while the 
template has A = 1.47254 x 10"^^ / = 3.000031688mHz, 
6 = 0.787087, </> = 3.18781, ip = 0.51218, t = 0.89419, and 
ipo = 0.179034. 

While a general MCMC algorithm functions as both a 
means to search the likelihood surface and sample from 
the posterior distribution, the Markovian nature is only 
required for the sampling portion of the process. If one 
is able to find the neighborhood of the true parameters 
by other means, and then implements an MCMC algo- 
rithm from that point, the subsequent samples will re- 
cover the correct posterior distribution. It is with this 
in mind that we chose to relax the requirement that 
all of our proposal distributions be Markovian. The al- 
gorithm described above remains the same, but is now 



open to non-Markovian proposal distributions. We re- 
fer to this non-MCMC approach as Metropolis-Hastings 
importance sampling. One example of a non-Markovian 
proposal involves how we implement the previously de- 
scribed Ifm jump. Early in the search phase, if one of 
these proposals was accepted, a second (identical) jump 
was attempted. The reason for this is that just as a 
local maximum occurs '~ Ifm away from the global max- 
imum, another (smaller) local maximum occurs ^ 2/m 
away. In fact, a chain of maxima occur spaced about one 
modulation frequency apart from each other in likelihood 
space (see Figure [3]). This non-Markovian move allowed 
searchers to "island-hop" around the likelihood surface if 
it found itself on one of the maxima in the island chain. 

For the purpose of searching to determine the neigh- 
borhood of the true parameters, which is the focus of 
this work, we allow ourselves the extra freedom provided 
by non-Markovian Metropolis-Hastings importance sam- 
pling and simulated annealing (described below). Once 
the search phase is complete the non-Markovian moves 
are suspended, and the algorithm performs a standard 
MCMC exploration of the posterior distribution. 




FIG. 3: Contours of constant likelihood for a single, 
monochromatic WD- WD binary system. Sky position direc- 
tions lie in the plane shown, while frequency is directed or- 
thogonal to the plane. The visible maxima are separated by 
approximately one modulation frequency. 

To encourage the chains to explore the full parameter 
space and to discourage the chains from getting stuck 
a local maxima we employ simulated annealing during 
the search phase. This is done by multiplying the noise- 
weighted inner product by an "inverse temperature" f3, 
and applying the cooling schedule 

I 1 t>Nc 

Here (3o is the initial heating factor and Nc is the length 
of the annealing phase. The choice of /3o depends on 
the SNR of the sources. When very bright sources are 
present we set (Jq ^ 10~^, while smaller values of (Jq ~ 
10~^ work better once the brightest sources have been 
removed. Our criteria for choosing (3q was that the chains 
should explore the full parameter range during the first 
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1000 or so iterations. If full range movement was not 
seen we dialed up the heat. The choice of N^. depends on 
/3o, and can be set by demanding that it takes, say, 500 
steps for (3 to decrease by a factor of two. 



B. F-Statistic 

The F-statistic [3^] uses multiple linear filters to ob- 
tain the extremum for the likelihood over the extrinsic 
parameters of the signal. Using the F-statistic one can 
search the intrinsic parameters, and recover the extrinsic 
parameters as a last step in the process. For a monochro- 
matic binary detected by LISA, this reduces the search 
space to three parameters: frequency and sky location 
{0, 4>)- Note: the tumbling orbit of LISA induces mod- 
ulations of the frequency into the signal, thus creating 
and interdependency in the signal between frequency and 
sky location that prevents 9 and from being treated as 
purely extrinsic quantities. 

At low-frequencies in the LISA response to a gravita- 
tional wave can be written: 



h{t) = h+{t)F+{t) + h^{t)F^^{t), 



(5) 



where h^{t) and (t) are the two polarizations of the in- 
cident gravitational wave [4l|, [4^ , which for a monochro- 
matic binary, to leading post-Newtonian order, are given 
by: 

h+{t) = A{1 + cos'^ t)cos{^{t) + ipo) 

hx{t) = -2Acost sin($(t) + (y3o). (6) 

F~^{t) and F^{t) are the beam pattern factors: 



F+{t) = -{cos2^D+{t) -sini^D'^it)) 
F'^it) = ^{sm2^|;D+{t) +cos2iPD''{t)) (7) 

where D^{t) and D^{t) are the detector pattern func- 
tions, whose mathematical form can be found in equa- 
tions (36) and (37) of RetQ. 
In the gravitational wave phase 

<S>{t-J,e,(j}) = 27r/< + 27r/AUsin6'cos(27r/™t-0), (8) 

one can see the coupling of the sky location and the fre- 
quency through the term that depends on the radius of 
LISA'S orbit, 1 AU, and its orbital modulation frequency, 
fm — 1/year. For the low frequency galactic sources we 
are considering, the gravitational wave amplitude. A, is 
effectively constant. Thus (O can be rearranged as: 



/i(i) =^a,(A,V,i,¥?o)^'(i;/ 



1 " 1 fj •) 



(9) 



where the time-independent amplitudes are given by 
fli 



— - ((1 + cos^ l) cos (^0 cos 2^1^ — 2 cos l sin ip^ sin 2-0) , 



02 = ^ (2 COS L sin (/3o cos 2'(/; + (1 + cos^ t) cos i/jq sin 2il?j ^ 

03 = ^ (2 cos L cos <y9o sin 2'(/; -I- ( 1 -f cos^ t) sin (^o cos 2 V') , 

04 = — ((1 + cos^ t) sinc/jQ sin2f/; — 2 cos t cos(/3o cos2?/;) , 

(10) 

and the time-dependent functions A^{t) are given by 



A' it) 


= D+{t-9, 


0) cos$(t;/,0,0) 




A\t) 


= D''{t;0, 


0)cos$(i;/,0,(/.) 




A' it) 


= D+{t;9, 


4>) sin<i>(i;/,0,</.) 




A\t) 


= D'{t-e, 


0)sin$(i;/,0,0). 


(11) 



Writing LISA'S signal as a superposition of gravitational 
waves and noise, 



Sa{t) = hait,X) +na{t) 



N 

E 



hl^{t,X,) + na,{t), (12) 



and defining the four constants iV* = (s|^') and the M- 
matrix, AP^ — {A^\A^), one can express (O as a matrix 
equation: 



My-o^ = Ni 



(13) 



Therefore, with a signal from LISA and the three in- 
trinsic parameter values, one can solve (by iteration or 
inversion) for the the amplitudes, and thus the extrin- 
sic parameters. The values of the intrinsic parameters 
are found by use of iterative Metropolis-Hastings impor- 
tance sampling. The extrinsic parameter values are given 
by: 



A 



Al - Al 



1 ^ [ A+04-AxOi 
— arctan 
2 



(Axa2 + ^+03), 

-Ax 



arccos 



1^0 = arctan 



A+ + .^Al-Al 

c(A+04 — AxCLl) 

-c(ylxa3 + ^+02)^ 



(14) 



where 



A+ = \/ (oi + 04)2 + (02 - azY 

+ V («i ^ «4)^ + (02 + 03)^ 
Ax = V (ai + ^4)^ + (02 - 03)^ 

-V («1 - «4)^ + (02 + 03)^ 

c = sign(sin(2V-')) . 



(15) 
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This description of the F-statistic automatically incor- 
porates the two independent LISA channels through the 
use of the dual-channel noise weighted inner product: 

We generalize the F-statistic to handle N overlapping 
sources by writing i — AK + Z, where K labels the source 
and Z = 1 — > 4 labels the four filters for each source. The 
F-statistic has the same form, but with 4iV linear filters 
A^*, and M*^ is a 47V x 4iV dimensional matrix. For slowly 
evolving galactic binaries, which dominate the confusion 
problem, the limited bandwidth of each individual signal 
means that the Af*-' is band diagonal, which lessens the 
difficulty in solving (jl3p for the large numbers of sources 
that are expected to be detected. 



C. The Blocked Annealed Metropolis-Hastings 
Algorithm 

A Gibbs Sampler [il] is a special case of the 
Metropolis-Hastings algorithm described above in which 
each parameter, in the set of all parameters, is updated 
sequentially using a proposal distribution that is the con- 
ditional of the posterior evaluated at the current values of 
the other parameters. The mixing rate can be improved 
by simultaneously updating blocks of highly correlated 
parameters, to give what is know as the Blocked-Gibbs 
algorithm [40|. Since it can be difficult to evaluate the 
full conditionals required for Gibbs sampling, we decided 
to stick to Metropolis-Hastings sampling. 

The blocks in the BAM algorithm are small sub-units 
of the frequency range being searched. As can be seen 
in Figure 31 which shows a schematic representation of a 
search region in our BAM algorithm, the search region is 
broken up into equal sized blocks. The algorithm steps 
through these blocks sequentially, updating all sources 
within a given block simultaneously. After all blocks have 
been updated, they are shifted by one-half the width of 
a block for the next round of updates. This allows two 
correlated sources that might happen to be located on op- 
posite sides of a border between two neighboring blocks 
to be updated together on every other update. This 
blocking provides a means to handle highly correlated 
searchers/sources, and provides an even greater benefit 
which is covered in IH Dl 

This simple extension to an MCMC algorithm allowed 
for quick and robust searching of isolated data snippets 
with up to ~ 100 templates (the limit on the number of 
templates is due to the computational cost of the multi- 
source F-statistic). In order to be able to handle the 
entire LISA band, we have to break the search up into 
sub-regions containing a manageable number of sources. 
This introduced the problem of edge effects from sources 
whose frequency lay just outside the chosen search re- 
gion, but deposited power into the search region. To 
combat the edge effects we introduced "wings" and an 
"acceptance window." The purpose of the wings is to 
create a buffer between the chosen search region and the 
regions beyond, so that sources in those outer regions 
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FIG. 4: A schematic representation of a search region in fre- 
quency space, showing the block structure of some the BAM 
algorithm. 



would not adversely affect the searchers in the accep- 
tance window. The search would be run the same as 
before, but at the end only those searchers inside the 
acceptance window would be considered as having been 
found. Searchers that ended up in the wings were dis- 
carded (even though many might be perfectly good fits 
to actual sources) . This would allow us step through fre- 
quency space, using multiple search regions. But that 
was not enough. 

A problem, which we call "slamming," can occur when 
a bright source hes just outside the range of the cho- 
sen search region. Slamming is the tendency of the 
searchers to migrate to the edge of the search region as 
the searchers try to fit to the portion of the power from 
the source outside the search region which is bleeding 
into the search region. Since the search region contains 
only a portion of the signal from that particular source, 
a single searcher is a poor match, and so other searchers 
soon are recruited to fix- up the fit. Figure [5] shows a case 
of slamming, where four searchers (in a data snippet with 
four sources) are drawn off to the edge of the search re- 
gion by a bright source just outside the search region. 
A tell-tale feature of slamming is the large amplitudes 
of the searchers and their high degree of correlation or 
anti-correlation. 

One fix to the slamming problem is to weight the 
matches in the wings less than matches in the acceptance 
window. We do this by attenuating the contribution from 
the wings in the noise weighted inner product. The atten- 
uation is done by increasing the noise spectral density in 
the wings, which we call "wing noise." We exponentially 
increase the noise spectral density, starting at the edge 
of the acceptance window, as shown in Figure |31 This 
addition to the algorithm is quite effective at lessening 
the impact on the likelihood of a searcher fitting power 
bleeding into wings of the search region. Figure [5] shows 
a search using wing noise in the same region as Figure [S] 
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As can be seen there is no slamming, and the searchers 
were able to find the true sources. 

Another fix to the slamming problem is to use infor- 
mation about bright sources that lay outside the search 
region, but close enough that their signals overlap with 
the search region, in the search process (e.g. subtracting 
the signal of such sources from the data snippet). This 
requires knowledge of the bright sources, but such knowl- 
edge can be gained by performing initial, exploratory 
searches for a few sources in the area surrounding the 
search region in question. As one is free to choose the 
number of searchers and the ranges of the search, one 
can use the BAM algorithm hierarchically, if desired, to 
seek out the brightest sources in a pilot search of the data. 
Information gleaned from the pilot searches can then be 
used to subtract bright sources that might lead to slam- 
ming. This hierarchical approach will be described in 
more detail in the Section IIII Al In general use of our 
BAM algorithm, we use both wing noise and information 
about bright sources outside the search regions to lessen 
the chance of slamming. 



cc 
-z. 



100 



10 



Search Region 



1 

2.999 



Acceptance Window 

True Sources 




3.001 3.002 
Frequency 



3.003 



3.004 



FIG. 6: The same search shown in Figure [5l but with the 
inclusion of wing noise. The wing noise has cured the slam- 
ming. 
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FIG. 5: An example of slamming, in which four searchers are 
drawn to the edge of the search region by a bright source 
placed just outside the search region. 



D. Time Scaling 

Since there are expected to be 15,000+ resolvable 
galactic binary systems in the LISA data, we need to be 
sure that any algorithm developed will be able to search 
for such a large number of sources in a practical period 
of time. A pure BAM algorithm that employed a meta 
template that is the sum of templates from individual 
sources and simple proposal updates would have a com- 
putational cost that scaled linearly in the the number 
of sources since the cost per source would be constant. 
The goal of a purely linear scaling is unrealizable in the 
current version of the BAM algorithm for two reasons. 



First, in using the F-statistic to calculate the likeli- 
hood values we are required to solve equation (|13p for the 
time-independent amplitudes a^. Inverting this equation 
would scale as N'^ ^ and while solving for the amplitudes 
using an iterative method is significantly better it does 
not scale linearly with N . The next update to the al- 
gorithm will be to discontinue use of the F-statistic and 
return to a seven parameter (per binary system) search. 

The second impediment is that some proposal distribu- 
tions, such as a multivariate normal distribution jumping 
along the eigendirections of the variance-covariance ma- 
trix are inherently non-linear in computational cost (cal- 
culation of the Fisher Information Matrix alone scales 
as A^^). It is here that the use of the blocks shows 
great benefit. Figure [7] shows how the time to perform a 
searcher update is affected by the number of searchers. 
The plot shows the average cost per individual searcher 
that is updated in a search using only multivariate nor- 
mal proposal distributions. In one case, all searchers are 
updated simultaneously (i.e. the entire search region is 
treated as one block). In the other case, breaking the 
updates into the blocks lessens the number of searchers 
being updated simultaneously, greatly reducing the over- 
all computational cost as the number of sources grows. 
The search was performed on a one year data stream in 
a O.OlmHz snippet starting at frequency of SniHz with 
blocks that were 4/year wide. Up through iV = 20, 
no blocks contained more than one search template and 
the increase in computational cost per search template 
update was just 1.25 times greater than a single tem- 
plate search, due solely to the use of the multi-source 
F-statistic. Beyond = 20 the blocks began to contain 
more than one searcher, and the non-linear nature of the 
normal proposal started to drive up the cost per update. 
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FIG. 7: Plot of the average computational time per update 
per search template for two cases, one where all sources are 
updated simultaneously, and the other where block updates 
are used. The computational time (y-axis) has been scaled 
by the average time per update for a single source search. 
The increase in cost per template update grows much more 
slowly for the blocked search. To achieve linear scaling in the 
total cost of the BAM algorithm we would need a constant 
computational cost per individual template update. 



III. ISSUES WHEN USING THE ALGORITHM 

Thus far we have described the features of the BAM 
algorithm, and discussed how those features aid in op- 
timizing the algorithm. In this section some of the im- 
plementation issues will be addressed. We will discuss 
choices of the sizes of search regions, the sizes of the 
wings, hierarchical searching, stopping criteria, and false- 
positive/false-negative levels. 



A. Hierarchical Method 

We are able to perform searches in a hierarchical man- 
ner simply by choosing to use less search templates than 
their are sources. For example, if we search a region that 
contains 10 sources with a meta-template built from 2 
galactic binary templates, the algorithm will generally 
recover the 2 sources that return the highest likelihood 
value (usually they will correspond to the highest SNR 
sources in the region). In this hierarchical approach the 
solution for the brightest sources will be thrown off by 
the sources that were neglected, but this bias can be cor- 
rected at subsequent stages in the hierarchy. At the next 
stage in the hierarchy more galactic binary templates are 
used, with the information from the earlier searches pro- 
viding starting locations for some of the search templates. 

Figure [8] describes an example where a 1000/m data 
snippet containing 281 sources with a observation time of 
one year was searched using 10, 30, and 50 searchers. In 
the case using 10 searchers, the 10 sources with the high- 



est SNRs were found, while for the case with 30 searchers 
only 25 of the 30 highest SNR sources were among those 
found (the other 5 searchers did find true sources, albeit 
with lower SNRs). 
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FIG. 8: Plot of the recovered (and true) parameter frequencies 
and their respective SNRs for the sources recovered by the 
BAM algorithm searching a 1000/m data snippet containing 
281 sources with a 1 year time of observation using 10, 30, 
and 50 searchers. 

One may be tempted to subtract the signals of sources 
found in the pilot search, before looking for more sources 
in the region, but our experience leads us to believe that 
this is not the best way to proceed. In a hierarchical 
search the initial solution for the bright sources will be 
thrown off by dimmer sources that they overlap with. To 
stop these errors from propagating it is better to allow 
the algorithm to refine the fit to the bright sources at 
subsequent stages in the hierarchy. 

Instead of removing the signal from the data stream, 
we include the information in the meta-template rep- 
resented by the combination of the search templates. 
Within the framework of our algorithm one has several 
options of how to include the information at subsequent 
stages. One option is to begin the next search with some 
of the search templates at the source locations deter- 
mined at the previous iteration, then allowing them to 
behave like any other searchers from that point on. An- 
other option is to hold the intrinsic parameters of the 
previously determined signals fixed during the high heat 
portion of the annealing phase (the extrinsic parameters 
of all templates are still updated since we are using a 
multi-template F-Statistic). The advantage of the later 
approach is that it protects the information gleaned from 
earlier stages in the hierarchy, as there is a danger that a 
previously determined template will be dislodged during 
the annealing phase. When the temperature reaches a 
selected value, the constraints are withdrawn and the in- 
herited search templates get treated like any other. This 
allows the information gathered in the earlier runs to be 
preserved, and also allows for adjustments to be made to 



8 



the parameters as needed due to the inclusion of other 
searchers. 

Another benefit is that such hierarchical searching can 
be used to improve efficiency. Knowing where the very 
bright sources are in the data snippet can be used to 
lessen the sizes of the wings of the search regions. This 
will be discussed in more detail in the next subsection. 



B. Choosing Window Sizes 

There are several factors that influence the choice of 
window size. In Subsection III Dl we saw that the cost per 
template update grows with the number of templates in 
a window. This argues for using small search windows. 
However, in Subsection lll Cl we saw that it was necessary 
to include a wing region to mitigate edge effects, and any 
signals recovered in the wings had to be discarded. Thus, 
we would like for the wings to cover a small fraction of the 
search window, so that the fraction of templates assigned 
to wings is small. The size of the wings is determined by 
the typical bandwidth of a source. A natural choice is to 
set the wing size equal to the typical half-bandwidth as 
this ensures that little power leaks into the acceptance 
window. The source bandwidth is determined by two 
factors, the width of the sideband comb: 

Be«2 f5 + 27r/— -j /„ (16) 

and the degree of spectral leakage caused by the finite 
observation time. The spectral leakage depends on the 
amount the carrier frequency differs from being an integer 
multiple of 1 /Tobs, and falls off inversely with the number 
of frequency bins. For observation times on the order of 
years and for frequencies in the milli- Hertz range, the two 
contributions to the bandwidth are of similar size. Some- 
what larger wings are needed if very bright sources have 
not been regressed from neighboring frequency windows. 

These physical considerations dictate a total wing size 
of ~ 15 30/m, and we would like to use snippets 
considerably larger than this in order to maximize the 
fraction of the templates that appear in the acceptance 
window. The problem with this is that in regions of high 
source density we end up with template densities of 1 per 
^ 4 frequency bins, so for a 1 year data stream the wings 
alone account for 4 — s- 7 templates. Thus, to keep us in 
the regime where the cost per update is close to that of 
a single source, we were forced to use snippets where the 
acceptance window was comparable in size to the wings. 
In future upgrades we hope to improve the scaling of the 
algorithm so that we can use larger search windows. 



C. Stopping Criteria 

In LISA data analysis we will not only have to deter- 
mine source parameters, but also the number of sources 



that can be resolved. Studies of the galactic confusion 
noise levels [1, H, H, 0, IHj El] provide some answers con- 
cerning source populations across the LISA band, as well 
as estimates of the number of resolvable binaries as a 
function of frequency [7,]. However, this information is 
better suited to determining the number of source tem- 
plates to start with, than the number with which to end. 
In order to discover where we must stop we will have to 
listen to the data. 

Models with more parameters will always produce bet- 
ter fits to the data, but beyond a certain point the recov- 
ered parameters become meaningless. What we seek is 
the best fit to the data with the simplest possible model. 
For a model with uniform priors we seek to minimize 

= - h\s - h) , (17) 

using the smallest number of source templates. Using 
Fisher matrix techniques it can be shown that the expec- 
tation value of this quantity is given by 

{X^)=N-D, (18) 

where M is the total number of data points and D is the 
model dimension. As one might have anticipated, it does 
not make sense to use a model with more parameters than 
there are data points. This sets an upper limit of 1 source 
template every 2 frequency bins, as there are 4 data 
points per frequency bin (2 independent data channels, 
each with a real and imaginary part), and 7 parameters 
per template. More refined criteria, such as the Bayesian 
evidence, set more stringent stopping criteria. 

There have been many different suggestions of how to 
weight goodness of fit against model complexity. Two in 
common use are the Akaike Information Critera (AIC) 
and the Bayesian Information Criteria (BIC) [47,]. We 
tried both, and found neither to be particularly satisfac- 
tory, settling instead on the Laplace approximation to the 
full Bayesian evidence. The evidence px{s) for a model 
X given data s is given by the integral 

Px{s) - j p{s\\,X)p{\X)dX. (19) 

Computing this integral is extremely expensive for high 
dimension models, but the Laplace approximation pro- 
vides the estimate: 

Px(s)~p(s|Aml,X) (^^) , (20) 

where p{s\Xm'l,X) is the maximum likelihood for the 
model, Vx is the volume of the model's parameter space, 
and AVx is the volume of the uncertainty ellipsoid 
(which we estimate using a Fisher Information Matrix). 
In general, adding another source template to the model 
will increase the likelihood, however, the AVx /Vx term 
penalizes larger models and serves as a built in Occam 
factor. 

As an example we will look at a data snippet containing 
4 sources. The source parameters and SNRs are shown in 
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TABLE I: Source Parameter for the 4 sources in the data 
snippet 



SNR 


A (10-24) 


/ (mHz) 


6 






L 


fo 


1.23 


2.05 


3.001514406 


1.259 


4.012 


0.759 


1.183 


2.551 


7.67 


10.8 


3.000315843 


2.437 


2.753 


2.484 


2.173 


2.550 


9.65 


12.1 


3.000454748 


2.198 


0.422 


2.880 


2.263 


2.991 


10.76 


9.27 


3.001985584 


1.336 


5.863 


0.931 


2.805 


4.048 



Table m There is one dim source which is not expected to 
be recovered (see Section llll DI for more details). Figure[9] 
plots the logarithms of the evidence and likelihood for 
models of increasing size. One can see that the evidence 
is peaked at the model with 3 source templates, while 
the likelihood continues to climb as the model dimension 
is increased. This tells us that the data favors a model 
with 3 sources over one with 4. 





abs( log (evidence) ) 




log (likelihood) 
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FIG. 9: Plot of the magnitude of the logarithms of the likeli- 
hood and evidence for increasing numbers of searchers, search- 
ing a data snippet with sources whose parameters are listed 
in Table H] injected into it. 

An alternative approach to model selection is to use the 
Reverse Jump MCMC algorithm, which allows for tran- 
sitions between models of different dimension. The frac- 
tion of the time the chain spends exploring each model 
is used as a measure of the relative evidence for the dif- 
ferent models. We plan to use the RJMCMC approach 
in future versions of the algorithm. 

One limitation of the way we have formulated our 
Bayesian model selection is that the evidence for a zero 
source model is ill defined, so we cannot compare models 
with and 1 source templates. This limitation can be re- 
moved if we expand our models to include the instrument 
noise parameters, which we plan to do in future versions 
of the algorithm. 

For those that favor the Frequentist approach to data 
analysis we provide some rough estimates of the false 
alarm and false dismissal rates in the following subsec- 
tions. 



D. False Negatives 

Here we study the detection rate for the BAM algo- 
rithm as a function of the SNR of an isolated source. To 
this end we performed searches of data snippets between 
3mHz and 3.031688mHz (a 1000/™ segment of the LISA 
band), each containing a single source. A set of parame- 
ters for 100 sources were chosen at random. Keeping all 
other parameters constant, the amplitudes were varied to 
increase the SNR. Simulated data sets for each of the 100 
sources were created at different SNR levels. The BAM 
algorithm was then applied to the simulated data using 
short chains of 15, 000 steps (10, 000 steps in the anneal- 
ing phase and 5,000 set in the sampling phase of each 
run). The searches were also run using longer chains, 
consisting of 75, 000 steps (50, 000 annealing/25, 000 sam- 
pling) . For each of two types of chains two analysis meth- 
ods were used to determined the parameter values. In 
one method the parameter values were determined by 
using average values of the parameters in the chain from 
the sampling phase {i.e. Bayes estimates). In the sec- 
ond method the parameter values were determined from 
the mode of the parameter histograms from the sampling 
phase {i.e. MAP estimates). Figure fTOl shows the de- 
tection probabilities for these searches based on the two 
analysis methods. The resulting parameter set is called 
a detection when they deviate by less than 5 — cr in each 
of the true source's intrinsic parameters (for more discus- 
sion on this cut-off see Section HV)) . 

As expected, the detection probability depends on the 
length of the chain, with the longer chains giving higher 
detection rates. However, a single very long chain is not 
the best way to ensure detection. Consider the search 
for sources with SNR = 5 using the MAP parameter 
estimates. The detection rate for the shorter chains were 
0.45, while for the the longer chain the rate was 0.57. 
However, if the search for a single SNR = 5 source is 
repeated multiple times, the detection rate also comes 
out at ^ 0.45. Thus, if the short chain search is run 
twice and the results from the two runs are combined, 
the detection probability improves to ~ 1 — 0.55^ — 0.7, 
at a total cost of 30,000 steps, which is still less than half 
the cost of the long chain searches. 

Of the two methods for determining the source param- 
eters, the MAP performed better than the Bayes estimate 
in the low SNR region. On the other hand, the MAP 
is harder to compute, especially when there are a large 
number of sources, so the current implementation of the 
full scale BAM algorithm still uses Bayes estimates. This 
will be corrected in future versions. 



E. False Positives 

We now turn our attention to computing the false 
alarm rate by searching data streams that contain only 
instrument noise. We employed two approaches: in the 
first we perform multiple searches with differing noise re- 
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FIG. 10: Probability of detection of a source using the BAM 
algorithm as a function of source SNR. 

alizations and in the second we performed an extended 
search of one noise reahzation. 

For the first test we performed 20, 000 searches using 
a single search template in the same frequency band we 
used to study the false negatives. Figure [11] shows a 
histogram of the SNRs for the finishing points of the 
searches. This can be used to give an idea as to the 
level where false positive begin to become a concern. In 
this frequency range, the histogram tells us that more 
than 99% of the searches ended on parameters leading 
to a SNR less than 5. So in accepting any result from a 
search with SNR above 5 in this regime there is roughly a 
1% chance of accepting a false positive, with the probabil- 
ity dropping precipitously for searchers returning higher 
SNRs. We repeated the analysis at different frequencies 
and found the false positive level to be much the same 
across the LISA frequency band. 

0.035 I . . . . . . 1 



but now a single long chain of one million steps was 
performed. Figure [T^ shows the frequency parameter 
over 100, 000 steps in the chain. One notices that the 
chain does not lock in on any particular frequency, rather 
it continually wanders about the entire search region. 
While this is common, it is not always the case, as in- 
strument noise can sometimes mimic a monochromatic 
source well enough to slow or even stop such exploratory 
movement in a chain. Two other reliable indicators of a 
false positive are the amplitude and the inclination angle 
of the binary system. Figures [T2] and [H] show their re- 
spective histograms. The cosine of the inclination angle 
is peaked around zero, and the amplitude is peaked at 
a level that gives templates whose spectral density has 
the same magnitude as the noise. The reason for these 
preferences is that they allow the template to optimally 
match the noise in the two LISA data channels. An incli- 
nation angle of 7r/2 gives equal weight to the two chan- 
nels, which allows the template to match the noise level 
in both channels equally well. 



3.035 
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FIG. 12: Plot of the frequency steps in a chain searching 
source-free noise in a 1000/m band starting at 3mHz. 




FIG. 11: Histogram of the SNRs for searches on source-free 
noise in a 100/m band starting at 3mHz. 

The second test used the same 1000/™ band at 3mHz, 



To summarize, if the BAM algorithm recovers a source 
with SNR > 5 it is unlikely to be a false positive. More- 
over, the chains show some very characteristic features 
when the templates are just matching noise, and these 
features can be used as a diagnostic to exclude false pos- 
itives. Lastly, one can always run the search multiple 
times, and if the same same set of parameters are re- 
covered over and over again it is less likely that we have 
found a false positive. 



IV. EXAMPLE SEARCHES 

In this section we show the results of several types 
of searches performed with the BAM algorithm. While 
results for single source searches are easy to describe, 
when the search is for hundreds of sources, and there 
are hundreds of successes, the numbers are much harder 
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FIG. 13: Histogram of the cosine of the inclination angle for 
a chain searching source-free noise in a IQQQ/m band starting 
at 3mHz. 
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FIG. 14; Histogram of the amplitude parameter for a chain 
searching source-free noise in a IQQQ/m band starting at 
3mHz. 



to present in a condensed manner. To that end, we will 
present multiple search cases using plots like those in Fig- 
ure [51 where the frequency values are shown, and we will 
list the deviations of the recovered intrinsic parameters 
scaled by the uncertainties given by the Fisher Informa- 
tion Matrix for the sources. Since we are mostly inter- 
ested in the search phase of the algorithm the post-search 
MCMC runs we chosen to be rather short, so the recov- 
ered posteriors are a little ragged. With this in mind we 
set our cut-off criteria for "finding" a source at a devia- 
tion of 5 — (T from any one of the true source's intrinsic 
parameters. In practice this cut-off was fairly robust as 
any template that did not have all parameters within a 
few a of the source typically had one or more parameters 
that were tens to hundreds of a out. We also imposed 
was a SNR minimum of 5 in addition to the Bayesian ev- 
idence criteria. This was perhaps stricter than necessary 



to keep down the possibility of a false positive (and in fact 
did lead to one actual detection being discounted), but 
with that cut-off there were no instances of false positives 
in any of the searches presented here. 



A. Searches of the Mock LISA Data Challenge 
Training Sets 

The Mock LISA Data Challenge (MLDC) consists of 
several types of challenges for the LISA data analysis 
community to test search algorithms using simulated 
LISA data. The first round in the MLDC consists of 
challenges for three source types: galactic binaries, su- 
permassive black holes, and extreme mass ratio inspirals. 
For this work we will focus on the challenges dealing with 
galactic binaries. Provided with each challenge are two 
data sets, one is a blind test where the source parameters 
are unknown, while the other is an open test where the 
source parameters are provided so that one may synchro- 
nize conventions. In this paper we will only be discussing 
results from the open data sets, as the MLDC Taskforce 
has asked that all results for the blind challenges be em- 
bargoed until December 2006. 



1. Single binary searches 

The first challenges consist of single binary systems 
injected into a LISA data stream with instrumental noise. 
In the Challenge 1.1.1 there are three tests. The first is a 
monochromatic binary with frequency, / = l.OiO.lmHz, 
the second has a frequency of / = 3.0 ± O.lmHz, and the 
third a frequency of / = 10.0± l.OmiJz. While the BAM 
algorithm is designed to handle multiple source searches 
these single source challenges provide a means to test our 
conventions as well check for any modeling errors. One 
source of modeling error is that we used a low frequency 
detector response model, which restricts us to searching 
for signals below 7 niHz. Thus we have only performed 
searches on the first two of these single source challenges. 

In the ImHz case the true source parameters and the 
results from the BAM algorithm are given in Table [TTl 
For this work all uncertainties will be calculated using a 
Fisher Information Matrix at the parameter values given 
by the chain. A longer run of the data chains after burn- 
in could also provide a means to calculate these uncer- 
tainties, but as was shown in [17] for the intrinsic param- 
eters these will be a very good match to those from the 
Fisher calculation. Here the deviations (or discrepancies) 
between the true parameters value of the three intrinsic 
parameters and the values determined through the search 
are less than la: A/ = [ftme - f search) = -0.7530(7/, 
M = -0.5563(70, and A0 = -0.20498(7^. 

For the 3mHz case the search returned mean pa- 
rameter values that were also discrepant from the true 
source parameters by less than la (A/ = 0.2164(7/, 
M = -0.09466(70, A0 = -0.7860(7^). 
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TABLE II: Results of a search of the MLDC Training Data Set 1.1.1a 





A (10-22) 


/ (mHz) 


6 







t 


<po 


True Parameters 
Recovered Parameters 
Parameter Uncertainties 


1.789229908 
2.364 
0.3876 


0.9930348535 
0.993034139 
9.493e-07 


0.4741143268 
0.4575 
0.02983 


5.19921 
5.196 
0.01754 


3.975816 
4A75 
0.3126 


0.1793956 
0.7364 
0.2095 


5.781211 
4.832 
0.6234 



In these two searches the algorithm is performed ex- 
actly as expected. This suggests that our current im- 
plementation of the algorithm is free of any significant 
systematic errors in either waveform generation or cal- 
culation of the likelihood values, as the MLDC data was 
created with an independently developed code. 



2. Low Source Confusion 

Challenge 1.1.4 for the MLDC is a test for algorithms 
in the low source confusion regime, where the source den- 
sity is 1 source per 10/m. Results for a search of the 
training data set are presented here. 

Using only the frequency range given in the challenge 
(S.OOOmHz < f < 3.015mHz), the BAM was run on the 
data stream (an approximate range of the number of 
source was given in the challenge, and indeed the exact 
number in the training data is known to be 50, but that 
information was not used in directing the algorithm). 
The log evidence was used as the stopping criteria for 
each search region. The frequency range of the data snip- 
pet was divided into 20 search regions such that each ac- 
ceptance window was 0.75/xHz in width, and the width 
of the wings were 0.1 times the bandwidth of a typical 
source with a frequency in the search region. 

Figure [T5] shows a plot of the locations in frequency 
of the 50 individual sources in the data snippet. The 
heights of the bars show the SNR of each source, while 
the width of the bars is Ifm- Results of the search are 
similarly expressed in Figure \W[ As these two plots are 
very similar Figured?] has been provided to highlight the 
difference between the them. 

The five sources shown in Figure [T71 represent the false 
negatives of this particular search. Three of these sources 
had SNRs < 5 and thus were not expected to be recov- 
ered given the results of subsection IIIIDI The remain- 
ing two unrecovered sources had SNRs < 5.6. Figure [TH] 
shows how well the search algorithm fit the true source 
parameters. It is a histogram of the differences in the 
135 intrinsic parameters of the recovered sources in units 
of their respective variances (as calculated via a Fisher 
Information Matrix located at the recovered parameter 
values). Nearly 92% of the parameters recovered by the 
algorithm differed from their true values by less than 2a. 
This fit might be improved some with a so-called 'fin- 
isher' step, which will be briefly discussed in more detail 
in the following subsection. 
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FIG. 15: Plot of the true parameter frequencies and their 
respective SNRs for the sources injected into a LISA data 
stream for Training data set 1.1.4 of the Mock LISA Data 
Challenge. 
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FIG. 16: Plot of the recovered parameter frequencies and 
their respective SNRs for the sources recovered by the BAM 
algorithm searching Training data set 1.1.4 of the Mock LISA 
Data Challenge. 



3. Strong Source Confusion 

Challenge 1.1.5 for the MLDC is a test for algorithms 
in the high source confusion regime, where the source 
density is 1 source per 2.5/m. Results for a search 
of the training data set are presented here. 

The BAM algorithm was run on the data streams using 
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FIG. 17: Plot of the true parameter frequencies and their 
respective SNRs for the sources that were not recovered by 
the BAM algorithm searching Training data set 1.1.4 of the 
Mock LISA Data ChaUenge. 
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FIG. 18: Histogram of the discrepancies between the true in- 
trinsic source parameters and the intrinsic parameters recov- 
ered by the BAM algorithm searching Training data set 1.1.4 
of the Mock LISA Data Challenge. Differences are given in 
units of the parameter variances. 



the frequency range given in the challenge (2.9985mHz < 
/ < 3.0015mHz). Again, the approximate range of the 
number of sources that was given in the challenge was 
not used in directing the algorithm. The log evidence 
was used as the stopping criteria for each search region. 
The frequency range of the data snippet was divided into 
10 search regions such that each acceptance window was 
0.3^Hz in width, and the width of the wings were 0.5 
times the bandwidth of a typical source in the search 
region, BW = 28/m. The wing size of this search is con- 
siderably larger than was used in the previous example. 
Initial runs on this data set with smaller wings showed 
signs of slamming, so the wing size was increased. 

Figure [m shows a plot of the locations in frequency of 



the individual sources in the data snippet. The heights 
of the bars show the SNR of each source, while the width 
of the bars is Ifm- The density of sources is even higher 
than first appears in the plot, however, since four sources 
share an identical frequency (2.998999384mHz) as do 
three other pairs of sources (3.000085802, 3.000629082, 
and 3.001173008mHz). 

Results of the search are displayed in FiguredOl As can 
be seen in the plot, the extremely high density of sources 
prevent the algorithm from recovering all of the sources. 
Of the 44 sources injected into the data stream, 27 were 
recovered, which corresponds to a recovered source den- 
sity of 1 per 4 frequency bins. Figure [22] shows a his- 
togram of the differences in the 81 intrinsic parameters 
of the recovered sources in units of their respective vari- 
ances (as calculated via a Fisher Information Matrix lo- 
cated at the recovered parameter values). The spread 
of the discrepancies for the recovered sources is larger 
than those for the case of low source confusion shown in 
Figure [181 Just over 76% percent of the parameters re- 
covered by the algorithm differed from their true values 
by less than 2a. This departure from the Fisher matrix 
predictions is due to the additional confusion noise from 
unrecovered sources. 

Figure [51] shows the sources that were not recovered 
by algorithm. Of the 17 unrecovered sources, 13 had a 
nearest neighbor that was within 1/m, including 5 of the 
unrecovered sources that shared an identical frequency 
with at least one other source. Sources that are close in 
frequency are much more likely to be highly correlated 
than those that are well separated (e.g. the brightest of 
the sources at / = 2.998999384mHz anti-correlates with 
two of the other sources at that frequency at the level of 
—0.81 and —0.67). This high density and corresponding 
high levels of correlation introduces two problems for the 
current implementation of the BAM algorithm. First, is 
that analysis of the chains was done using the mean val- 
ues of a particular search template. With nearby sources, 
the individual searchers can jump between the sources 
and the calculated mean value is a weighted mean of the 
two, or more, close sources (weighted by the number of 
steps in the chain spent at each source). In the next im- 
plementation of the algorithm we combine the all the pa- 
rameter chains of a given type into a single histogram and 
use standard spectral line fitting software to fit the com- 
bined PDF by multiple Gaussians. Second, the current 
implementation of the algorithm includes a "blast" pro- 
posal distribution that separates highly anti-correlated 
sources {k < —0.9). This proposal was included to lessen 
the effect of slamming (by performing a uniform draw 
on one of the two anti-correlated searchers). With the 
inclusion of the wing noise and returning to a 7 param- 
eter search (per template) , this proposal will most likely 
not be needed. This should allow the search templates 
to spend more time in areas with highly anti-correlated 
sources. 

Lastly, the fit could be improved using a 'finisher' step 
in the analysis process. While the full details of such a 
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finisher are beyond the scope of this paper, one can think 
of it as continuing the search algorithm using proposals 
specific to providing efficient mixing of the chain (such as 
a drawing from a multivariate gaussian distribution) that 
will allow for searchers to work through the issues created 
by the high levels of correlation and reach the posterior 
distribution. Also, in this step sources can be introduced 
to the fit given by the BAM algorithm to provide a better 
fit. 
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FIG. 19: Plot of the true parameter frequencies and their 
respective SNRs for the sources injected into a LISA data 
stream for Training data set 1.1.5 of the Mock LISA Data 
Challenge. 
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FIG. 20: Plot of the recovered parameter frequencies and 
their respective SNRs for the sources recovered by the BAM 
algorithm searching Training data set 1.1.5 of the Mock LISA 
Data Challenge. 
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FIG. 21: Plot of the true parameter frequencies and their 
respective SNRs for the sources that were not recovered by 
the BAM algorithm searching Training data set 1.1.5 of the 
Mock LISA Data Challenge. 
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FIG. 22: Histogram of the discrepancies between the true in- 
trinsic source parameters and the intrinsic parameters recov- 
ered by the BAM algorithm searching Training data set 1.1.5 
of the Mock LISA Data Challenge. Differences are given in 
units of the parameter variances. 



B. A Large A'^ Search for Resolvable Binaries 

In this subsection we will discuss the results of a search 
for sources in a 1000/,„ data snippet at 3mHz. These 
sources were chosen from a galactic model described by 
Nelemans et al @. For this realization there are 281 
sources in the data snippet. 

Unlike the previous searches, the observation time is 
3 years. This provides a test of the algorithm for multi- 
year observation times, and models a search through a 
non-trivial section of the galactic background (nearly 1% 
of the overall frequency range, and > 1% of the expected 
resolvable sources). With a three year observation time 
all but four of the sources have a SNR > 5. Figure [23] 
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shows a histogram of the SNRs for sources befow 100 
(another 9 sources have SNRs > 100). 

Figure [Ml shows a plot of the focations in frequency of 
the individual sources in the 1000/m data snippet. The 
heights of the bars show the SNR of each source, while 
the width of the bars is 1/m- The results of the search 
are shown in Figure [23 As these two plots are very 
similar, Figure [26] has been provided (and re-scaled) to 
highlight the difference between the them. The search 
was able to find 265 of the 281 sources, with all but 3 of 
the unrecovered sources having a SNR < 6. One of these 
"unrecovered" sources was an instance of the SNR cut-off 
inadvertently omitting a detected source (the source at 
f = 3.02634519 has a SNR = 4.50, and it was recovered 
with a SNR = 4.62). It is interesting to note that these 
results are consistent with the prediction in Section [ill Dl 
regarding the rate of false negatives. This suggests that 
most of the 12 sources that were not recovered and have 
a SNR > 5 might be found with repeated searches. 

Figure [27] shows a histogram of the differences in the 
795 intrinsic parameters of the recovered sources in units 
of their respective variances (as calculated via a Fisher 
Information Matrix located at the recovered parameter 
values). Slightly more than 91% of the parameters re- 
covered by the algorithm differed from their true values 
by less than 2a. 
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FIG. 24: Plot of the true parameter frequencies and their 
respective SNRs for the sources injected into a LISA data 
stream for 281 sources drawn from a galactic distribution with 
a 3 year time of observation. 
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FIG. 25: Plot of the 270 recovered parameter frequencies and 
their respective SNRs for the sources recovered by the BAM 
algorithm searching a 1000/m data snippet containing 281 
sources with a 3 year time of observation. 



FIG. 23: Partial histogram of the SNRs for the true source 
parameters of a 1000/m data snippet containing 281 sources 
with a 3 year time of observation. 



V. CONCLUSION 

We have developed and tested an algorithm that is 
capable of locating and characterizing galactic binaries 
across the entire LISA band. In regions of strong source 
confusion we found that the algorithm could recover 1 
source every 4 frequency bins. We found that the al- 
gorithm performs very well on snippets taken from a full 
galactic foreground model, and since the BAM algorithm 



develops a global solution by sewing together searches 
over subsets of the LISA data, completing the full analy- 
sis of the galactic simulation is just a matter of computer 
time. 

While the current algorithm is very effective, we iden- 
tified many improvements and extensions that are now 
being implemented. The waveform modeling has been 
updated to include the full LISA response and frequency 
evolution, and we have reverted to performing full pa- 
rameter searches due to the computational cost incurred 
by the multi-source F-statistic. Work is currently in 
progress to extend the search to include parameters that 
describe the noise in each data channel. This extension 
will be particularly important below 3 mHz as the effec- 
tive noise level will be set by unresolved sources, so we 
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FIG. 26: Plot of the 16 true parameter frequencies and their 
respective SNRs for the sources that were not recovered by the 
BAM algorithm searching a 1000/m data snippet containing 
281 sources with a 3 year time of observation. 



will not know in advance what weighting to use in the 
inner products. Our current stopping criteria using the 
Laplace approximation to the Bayes evidence could be 
eliminated if we switch to a transdimensional Reverse- 
Jump MCMC method ;49'|. The analysis of the post- 
search chains can also be improved using spectral line 
fitting techniques. 



Even with our current algorithm we estimate that it 
would take less than two weeks to process a full galactic 
foreground on a 3 GHz, 128 node cluster. With the mod- 
ifications we are implementing we expect both the speed 
and fidelity of the algorithm to be much improved. We 
will have an opportunity to test the updated BAM algo- 
rithm on a full galactic simulation in the second round 
of Mock LISA Data Challenges, which are set for release 
in December 2006. 
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FIG. 27: Histogram of the discrepancies between the true in- 
trinsic source parameters and the intrinsic parameters recov- 
ered by the BAM algorithm searching a 1000/m data snippet 
containing 281 sources with a 3 year time of observation. Dif- 
ferences are given in units of the parameter variances. 
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